Improving the representation of time structure in front-ends for automatic speech recognition

Author

  • Wendy J. Holmes
Abstract

This paper describes investigations into the use of ‘excitation-synchronous’ spectral analysis to provide acoustic features for automatic speech recognition. Within each 10 ms frame, the region of maximum power is located and used as the centre of the window for a subsequent Fourier transform. The method has been found to be effective in locating stop bursts and vocal-tract responses to glottal closures. This excitation-synchronous analysis has been compared with the more conventional fixed-interval analysis for window lengths ranging from 5 to 25 ms. In connected-digit recognition experiments using mel-cepstrum features, the excitation-synchronous analysis with a window length of 10 ms gave a 10% improvement in recognition performance when compared with the best of the fixed-window conditions.
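
The window-placement step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name, the moving-average power measure (`power_ms`), the Hamming window and the zero-padding at the signal edges are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def excitation_synchronous_spectra(signal, sample_rate, frame_ms=10.0,
                                   window_ms=10.0, power_ms=2.0):
    """Return (centres, spectra): one window centre and one magnitude
    spectrum per frame. All parameter names are hypothetical."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    win_len = int(round(sample_rate * window_ms / 1000.0))
    smooth_len = max(1, int(round(sample_rate * power_ms / 1000.0)))

    # Assumed power measure: squared samples smoothed by a moving average.
    power = np.convolve(signal ** 2, np.ones(smooth_len) / smooth_len,
                        mode="same")

    # Zero-pad so that windows centred near the signal edges stay in range.
    half = win_len // 2
    padded = np.pad(signal, (half, win_len - half))
    window = np.hamming(win_len)  # assumed taper

    centres, spectra = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        # Locate the point of maximum power inside this fixed frame.
        centre = start + int(np.argmax(power[start:start + frame_len]))
        # padded[i] holds signal[i - half], so this slice is centred on `centre`.
        segment = padded[centre:centre + win_len]
        spectra.append(np.abs(np.fft.rfft(segment * window)))
        centres.append(centre)
    return np.array(centres), np.array(spectra)
```

With fixed-interval analysis the window centre would simply be the middle of each frame; here it moves to wherever the power peaks, which is the only difference this sketch tries to show. Mel-cepstrum features would then be computed from each spectrum in the usual way.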


Similar resources

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...


Improving ASR performance for reverberant speech

The performance of current automatic speech recognition (ASR) systems is very sensitive to the presence of room reverberation in the incoming speech signal. We investigate a family of front-end speech representations that focus on slow changes in the gross spectral structure of speech for their ability to improve the robustness of ASR systems to reverberation. A number of the front ends pro...


On the comparison of front-ends for robust speech recognition in car environments

In this paper we compare several front-ends for automatic speech recognition systems operating in noisy conditions. The analyzed front-ends are based on standard MFCC parameterizations and include methods to compensate for the effect of noise on the representation of the speech signal. Three different compensation methods are considered in this work: Cepstral Mean Normalization, Spectral S...


A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition

This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) frontend that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. ...


Publication date: 2000